Week 1: Introduction to GIS

PPOL 6805 / DSAN 6750: GIS for Spatial Data Science
Fall 2024

Jeff Jacobs

jj1088@georgetown.edu

Wednesday, August 28, 2024

Welcome to The Wonderful World of GIS!

Your Final Project

Unit 1: Maps

  • Your least favorite part of the course (per survey 😜)
  • My favorite part of the course (because I love overthinking things)
  • My goal given survey results: Let’s think of this unit like learning languages for expressing spatial information:
    • Temporal Information \(\Rightarrow\) 22.5 seconds
    • Spatial Information \(\Rightarrow\) POLYGON ((2 1, 3 1, 5 2, 6 3, 5 3, 4 4, 3 4, 1 3, 2 1),(2 2, 3 3, 4 3, 4 2, 2 2))
  • I think you’ll be surprised at how the complexity of geospatial/spatio-temporal data \(\implies\) the need for programming-language-independent representations
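To make the "language-independent representation" idea concrete, here is a minimal sketch that reads the POLYGON text above using only the Python standard library. The helper names (`parse_polygon_wkt`, `ring_area`) are made up for illustration, and the parser only handles this one simple POLYGON form; in practice a library such as sf or geopandas does this for you.

```python
import re

# The polygon from the slide: one exterior ring plus one interior ring (a hole)
WKT = "POLYGON ((2 1, 3 1, 5 2, 6 3, 5 3, 4 4, 3 4, 1 3, 2 1),(2 2, 3 3, 4 3, 4 2, 2 2))"


def parse_polygon_wkt(wkt):
    """Return a list of rings; each ring is a list of (x, y) tuples."""
    if not wkt.strip().upper().startswith("POLYGON"):
        raise ValueError("this sketch only handles POLYGON text")
    rings = []
    # Each ring is an innermost parenthesized group of comma-separated "x y" pairs
    for ring_text in re.findall(r"\(([^()]+)\)", wkt):
        ring = [tuple(float(v) for v in pair.split()) for pair in ring_text.split(",")]
        rings.append(ring)
    return rings


def ring_area(ring):
    """Unsigned area of a closed ring, via the shoelace formula."""
    twice_area = sum(x1 * y2 - x2 * y1 for (x1, y1), (x2, y2) in zip(ring, ring[1:]))
    return abs(twice_area) / 2


exterior, *holes = parse_polygon_wkt(WKT)
area = ring_area(exterior) - sum(ring_area(hole) for hole in holes)
print(area)  # 7.5
```

The same text string could be handed to R, Python, or JavaScript unchanged, which is the point of the representation.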

Unit 2: Using Code to Make Maps

  • (More on this in Prereqs section below!)
  • Given representations from Unit 1, the task of coding becomes the task of finding the “best” library for loading/manipulating/plotting them
    • Where “best” = best for you!
  • In R: sf and friends (tidyverse)
  • In Python: geopandas

Unit 3: Spatial Data Science

  • Drawing inferences about spatial phenomena
  • The meat of the course
  • How can we write code (Unit 2) to analyze a map (Unit 1) so as to…
    • Discover patterns (EDA: Exploratory Data Analysis) or
    • Test hypotheses (CDA: Confirmatory Data Analysis)

Unit 4: Applications / Final Project

  • Take everything you’ve learned in Units 1-3 and Kamehameha them onto something you care about in the world!
  • Public Policy: Which counties are most in need of more transportation infrastructure?
  • Urban Planning: Which neighborhoods are most in need of a new bus stop?
  • Epidemiology: What properties of a region make it more/less susceptible to infectious diseases? Where should we intervene to “cut the chain” of a disease vector?

Who Am I? Why Am I Teaching You?

  • Started out as PhD student in Computer Science
    • UCLA: Algorithmic Game Theory
    • Stanford (MS): Economic Network Analysis
  • Ended up with PhD in Political Economy
    • Columbia: “Computational Political Theory”

My GIS Adventures

  • High school project: mine defusal in Indochina
  • As a Telecommunications Engineer for Huawei (HKUST)
  • As an Urban Economist at UC Berkeley

My GIS 🤯 Moment

From Robert (2016)

Huawei: Optimizing Cell Tower Placement

From LTE 4G/5G Self-Organizing Networks

The Suburbanization of Poverty

  • Since 2008, a person living in poverty in the US is more likely to be in a suburb than an “inner city”
  • What does this mean for…
    • Access to Food / Public Services?
    • Finding a job \(\leadsto\) Commuting?
  • My job: computing “suburban accessibility indices”
  • Does commuting = straight line distance?

“Distance” vs. Distance!

  • You’ve just been hired as a fine art curator at The Whitney… Congratulations!
Commuting 1 mile to the Whitney
Also commuting 1 mile to the Whitney
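As a rough sketch of why "distance" needs care, the snippet below compares straight-line distance with distance along a rectangular street grid for the same pair of points. The coordinates and function names are hypothetical (units are arbitrary, say city blocks), and a real commute would need an actual street or transit network, plus obstacles like rivers.

```python
import math


def straight_line(a, b):
    """Euclidean ("as the crow flies") distance between two (x, y) points."""
    return math.hypot(b[0] - a[0], b[1] - a[1])


def street_grid(a, b):
    """Manhattan distance: travel only along a rectangular street grid."""
    return abs(b[0] - a[0]) + abs(b[1] - a[1])


museum = (0, 0)  # hypothetical location
home = (3, 4)    # hypothetical location

print(straight_line(home, museum))  # 5.0
print(street_grid(home, museum))    # 7
```

Two homes at the same straight-line distance can therefore have quite different travel distances, even before accounting for travel time.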

Why Should You Care About GIS?

  • As a Human
  • As a Data Scientist
  • As a Public Policy Expert

As Humans

  • To understand the world around you!

Charles Dupin, Carte figurative de l’instruction populaire de la France (1826)

As Data Scientists

  • All data scientists are expected to know how to analyze “standard” types of data: tabular, numeric data (think spreadsheets)
  • However, you can differentiate yourself in the scary scary job market by developing a particular focus on some “non-standard” type:

Hello Mr. Google Musk, yes, indeed, I have a wealth of experience working with [text data / temporal data / signal processing / geospatial data]. This job will be no problem for me.

As Public Policy Experts

  • Oftentimes, all it takes is one map to see why a policy has failed 😱

Who can guess what this map represents? (Source)

So… What Is GIS?

It’s Completely Made Up

Like, even more made up than other made-up technical terms… 😵‍💫

What I Mean By “Made Up”

  • The libraries and tools we’ll use are specific systems/methods for analyzing geospatial data
  • GIS is an “umbrella term”, which just vaguely refers to this entire universe of libraries/tools/techniques/approaches
  • Umbrella Term: Coding
    • Concepts: Variables, Control Flow, Algorithms
    • Specific Skills: Python, R, JavaScript
  • Umbrella Term: GIS
    • Concepts: Projections, Vector vs. Raster, Spatial Data Formats (shapefiles, .geojson)
    • Specific Skills: ArcGIS, GeoPandas (Python), sf (R)

ArcGIS

  • For info on Georgetown’s provision of ArcGIS (Online, Pro, and Desktop), see the Library Guide

Ukraine Level-1 Administrative Regions Map (see CDTO talk)

Then… Why Can’t We Just Use ArcGIS?

Analogy from non-geospatial data science:

  • Text ↔ Drawn Map
  • Spreadsheet ↔ Digital Map
  • Equations ↔ Maps w/ArcGIS
  • Code ↔ This Class
Start writing
info.txt
I gave Ana $3, then Ana paid me back $2. [...]
Realize there’s regularity/structure 🤔
Start entering info in rows
Fr   To   Amt  Bal
Me   Ana  $3   -$3
Ana  Me   $2   -$1
Realize you’re manually computing things that could be automated 🤔
Start using equations
Fr   To   Amt  Bal
Me   Ana  $3   =0-C1
Ana  Me   $2   =D1+C2
Realize you need fancier equations, and/or need to coordinate with inputs (APIs), outputs (plotting libraries) 🤔

Write code

plot_balance.py
import pandas as pd
df = pd.read_csv(...)
calc_weekly_balance()
df.plot()
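For comparison with the spreadsheet stage, here is a minimal standard-library sketch of the same running-balance computation. The `running_balance` helper and the hard-coded transactions are illustrative assumptions (they mirror the two rows in the table above), not the contents of the original plot_balance.py.

```python
# Each transaction is (from, to, amount), matching the Fr / To / Amt columns
transactions = [
    ("Me", "Ana", 3),   # I gave Ana $3
    ("Ana", "Me", 2),   # Ana paid me back $2
]


def running_balance(rows, me="Me"):
    """Return my balance after each transaction (negative = I am owed money)."""
    balance = 0
    balances = []
    for sender, receiver, amount in rows:
        if sender == me:
            balance -= amount   # money leaving my pocket
        elif receiver == me:
            balance += amount   # money coming back
        balances.append(balance)
    return balances


print(running_balance(transactions))  # [-3, -1]
```

Once the rule lives in a function, swapping the hard-coded list for a file or an API response is a small change, which is the step the spreadsheet makes awkward.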

Profit 💲💰🤑💰💲

The Spatial Data Science Universe

  • We’ll cover key “pieces”: GDAL (Geospatial Data Abstraction Library) for reading and writing raster/vector data formats, PROJ for converting between projections, GEOS for computational geometry

Course Policy Things

  • How To Not Be Scared of Prerequisites
  • ChatGPT
  • Learning How To Learn

Pedagogical Principles

  • There’s literally no such thing as “intelligence”
  • Anyone is capable of learning anything (neural plasticity)
  • Growth mindset: “I can’t do this” \(\leadsto\) “I can’t do this yet!”
  • The point of a class is learning: understanding something about the world, either (a) For its own sake (end in itself) or (b) Because it’s relevant to something you care about (means to an end)

Our teaching should be governed, not by a desire to make students learn things, but by the endeavor to keep burning within them that light which is called curiosity. (Montessori 1916)

ChatGPT and Whatnot

  • If you feel like ChatGPT will help you learn something in the course, then use it!
  • If you feel like you’re using it as a “crutch”, try to hold yourself accountable for not using it!
Take the time/energy you're using to worry about...
  • ChatGPT
  • Collaboration Policies
  • Plagiarism
Use it instead to worry about...
  • Learning GIS

I Am The Opposite of a Prereq-Stickler

  • I genuinely believe that I can make the course accessible to you, meeting you wherever you’re at, no matter what!
  • Everyone learns at their own pace (who says 14 weeks is the “correct” amount of time to learn GIS?), and I structure my courses as best as I possibly can to adapt to your pace
  • \(\Rightarrow\) Assessments (HW, Midterm) are valuable in two ways:
    • [Valuable for you] As an accountability mechanism to make sure you’re learning the material (how do we know when we’ve learned something? When we can answer questions about it / use it to accomplish things!)
    • [Valuable for me] For assessing and updating the pace

R and/or Python and/or JS

  • My Geometry vs. Algebra Rant… Euclid’s Elements, Book VI, Proposition 28.
  • The problem: Divide a given straight line so that the rectangle contained by its segments may be equal to a given area, not exceeding the square of half the line.

Geometers solved w/geometry (300 BC)…

Algebraists solved w/algebra (2000 BC)…

\[ \begin{align*} &ax^2 + bx + c = 0 \\ \Rightarrow \; & x_+ = \frac{-b + \sqrt{b^2 - 4ac}}{2a} \end{align*} \]
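To connect the two solutions: if the line has length \(L\) and one segment has length \(x\), the rectangle contained by the segments has area \(x(L - x)\), so Euclid's problem with target area \(A\) is the quadratic \(x^2 - Lx + A = 0\). A minimal sketch of the algebraic route follows; the function name `divide_line` and the example numbers are illustrative assumptions.

```python
import math


def divide_line(length, area):
    """Split a line of the given length into two segments whose rectangle has the given area.

    Solves x**2 - length*x + area = 0 with the quadratic formula (a=1, b=-length, c=area).
    """
    if area > (length / 2) ** 2:
        # Euclid's own condition: the area may not exceed the square of half the line
        raise ValueError("area must not exceed the square of half the line")
    root = math.sqrt(length ** 2 - 4 * area)
    return (length - root) / 2, (length + root) / 2


short, long_ = divide_line(10, 21)
print(short, long_)  # 3.0 7.0
```

Here the segments 3 and 7 sum to the line length 10 and multiply to the area 21, so the geometric and algebraic statements agree.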

From 1637 onwards, whichever is easier! 🤯🤯🤯 (Isomorphism)

Fig 1: Circle with radius 1? Or \((x,y)\) satisfying \(x^2 + y^2 = 1\)?

Learning How To Learn

He’s Entirely Correct!

From Elsevier Osmosis: Spaced Repetition

Let’s Make Some Dang Maps!

Our First Map: Polygon Data

Reading layer `Census_Tracts_in_2020' from data source 
  `/Users/jpj/gtown-local/ppol6805/w01/data/DC_Census_2020/Census_Tracts_in_2020.shp' 
  using driver `ESRI Shapefile'
Simple feature collection with 206 features and 315 fields
Geometry type: POLYGON
Dimension:     XY
Bounding box:  xmin: -8584933 ymin: 4691871 xmax: -8561515 ymax: 4721078
Projected CRS: WGS 84 / Pseudo-Mercator
[1] "sf"         "data.frame"
Simple feature collection with 6 features and 12 fields
Geometry type: POLYGON
Dimension:     XY
Bounding box:  xmin: -8577962 ymin: 4708107 xmax: -8572564 ymax: 4716136
Projected CRS: WGS 84 / Pseudo-Mercator
  OBJECTID  TRACT       GEOID  ALAND AWATER STUSAB SUMLEV     GEOCODE STATE
1        1 002002 11001002002 849376      0     DC    140 11001002002    11
2        2 002101 11001002101 600992      0     DC    140 11001002101    11
3        3 002102 11001002102 725975      0     DC    140 11001002102    11
4        4 002201 11001002201 415173      0     DC    140 11001002201    11
5        5 002202 11001002202 698895    566     DC    140 11001002202    11
6        6 000101 11001000101 199776   5261     DC    140 11001000101    11
                NAME POP100 HU100                       geometry
1 Census Tract 20.02   4072  1532 POLYGON ((-8575655 4714476,...
2 Census Tract 21.01   5687  2335 POLYGON ((-8574745 4715676,...
3 Census Tract 21.02   5099  2221 POLYGON ((-8573824 4715684,...
4 Census Tract 22.01   3485  1229 POLYGON ((-8574654 4714781,...
5 Census Tract 22.02   3339  1454 POLYGON ((-8573792 4714811,...
6  Census Tract 1.01   1406   999 POLYGON ((-8577962 4708867,...
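The coordinates in this output look nothing like latitude/longitude because the layer is stored in "WGS 84 / Pseudo-Mercator", where positions are in meters on a projected plane. As a hedged sketch of what the projection is doing, the snippet below inverts the standard spherical Web Mercator formulas for the bounding box printed above. The function name is an assumption, and in practice PROJ (via sf or geopandas) handles this conversion.

```python
import math

# Sphere radius used by Pseudo-Mercator (equal to the WGS 84 semi-major axis), in meters
EARTH_RADIUS_M = 6378137.0


def web_mercator_to_lonlat(x, y):
    """Convert Pseudo-Mercator (x, y) in meters to (longitude, latitude) in degrees."""
    lon = math.degrees(x / EARTH_RADIUS_M)
    lat = math.degrees(2 * math.atan(math.exp(y / EARTH_RADIUS_M)) - math.pi / 2)
    return lon, lat


# Bounding box from the st_read output above
xmin, ymin, xmax, ymax = -8584933, 4691871, -8561515, 4721078

print(web_mercator_to_lonlat(xmin, ymin))  # south-west corner
print(web_mercator_to_lonlat(xmax, ymax))  # north-east corner
```

The two corners should land at roughly 77.1°W, 38.8°N and 76.9°W, 39.0°N, which is a quick sanity check that the tracts really are in Washington, DC.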

sf Objects

Classes 'sf' and 'data.frame':  206 obs. of  316 variables:
 $ OBJECTID : int  1 2 3 4 5 6 7 8 9 10 ...
 $ TRACT    : chr  "002002" "002101" "002102" "002201" ...
 $ GEOID    : chr  "11001002002" "11001002101" "11001002102" "11001002201" ...
 $ P0010001 : int  4072 5687 5099 3485 3339 1406 3417 4108 4672 6161 ...
 $ P0010002 : int  3647 5071 4599 3138 2957 1259 3097 3697 4278 5553 ...
 $ P0010003 : int  1116 1037 901 973 619 1083 2783 2535 3573 4645 ...
 $ P0010004 : int  1751 2642 2910 1722 1636 47 57 348 193 260 ...
 $ P0010005 : int  27 86 50 21 48 5 3 9 11 16 ...
 $ P0010006 : int  84 120 87 94 57 92 176 624 399 488 ...
 $ P0010007 : int  0 1 0 2 3 1 2 6 0 6 ...
 $ P0010008 : int  669 1185 651 326 594 31 76 175 102 138 ...
 $ P0020002 : int  1022 1868 1074 683 1003 140 316 456 357 557 ...
 $ P0020005 : int  1033 885 805 864 548 1054 2711 2399 3492 4533 ...
 $ P0020006 : int  1722 2586 2856 1687 1594 47 56 339 184 248 ...
 $ P0020007 : int  4 20 14 3 8 4 0 4 3 11 ...
 $ P0020008 : int  83 115 83 85 54 90 176 621 396 482 ...
 $ P0020009 : int  0 1 0 2 0 1 1 6 0 5 ...
 $ P0020010 : int  21 39 40 14 22 3 22 29 25 26 ...
 $ P0030001 : int  3198 4375 3984 2755 2556 1288 2929 4079 4212 5081 ...
 $ P0030003 : int  855 827 741 742 496 1005 2427 2535 3236 3861 ...
 $ P0030004 : int  1505 2149 2340 1444 1310 39 53 346 182 223 ...
 $ P0030005 : int  24 58 44 13 38 2 3 9 4 16 ...
 $ P0030006 : int  71 100 79 75 57 90 161 624 380 433 ...
 $ P0030007 : int  0 1 0 2 3 1 0 6 0 6 ...
 $ P0030008 : int  470 816 433 248 393 27 72 174 95 101 ...
 $ P0040002 : int  701 1277 735 485 672 123 248 435 313 442 ...
 $ P0040005 : int  805 731 674 671 452 977 2365 2399 3164 3775 ...
 $ P0040006 : int  1490 2110 2300 1427 1285 39 53 337 173 211 ...
 $ P0040007 : int  4 16 13 1 8 2 0 4 3 11 ...
 $ P0040008 : int  70 95 75 73 54 88 161 621 377 427 ...
 $ P0040009 : int  0 1 0 2 0 1 0 6 0 5 ...
 $ P0040010 : int  8 25 28 11 17 3 20 29 18 17 ...
 $ H0010001 : int  1532 2335 2221 1229 1454 999 2053 11 2169 2845 ...
 $ H0010002 : int  1394 2107 1959 1141 1273 865 1724 11 1814 2634 ...
 $ H0010003 : int  138 228 262 88 181 134 329 0 355 211 ...
 $ ALAND    : int  849376 600992 725975 415173 698895 199776 1706484 505004 776435 1042157 ...
 $ AWATER   : int  0 0 0 0 566 5261 516665 0 439661 2305 ...
 $ STUSAB   : chr  "DC" "DC" "DC" "DC" ...
 $ SUMLEV   : int  140 140 140 140 140 140 140 140 140 140 ...
 $ GEOCODE  : chr  "11001002002" "11001002101" "11001002102" "11001002201" ...
 $ STATE    : int  11 11 11 11 11 11 11 11 11 11 ...
 $ NAME     : chr  "Census Tract 20.02" "Census Tract 21.01" "Census Tract 21.02" "Census Tract 22.01" ...
 $ POP100   : int  4072 5687 5099 3485 3339 1406 3417 4108 4672 6161 ...
 $ HU100    : int  1532 2335 2221 1229 1454 999 2053 11 2169 2845 ...
 $ P0010009 : int  425 616 500 347 382 147 320 411 394 608 ...
 $ P0010010 : int  392 572 451 310 356 135 304 383 380 571 ...
 $ P0010011 : int  61 42 69 40 44 17 14 62 31 46 ...
 $ P0010012 : int  33 11 11 8 8 8 25 8 15 29 ...
 $ P0010013 : int  38 37 41 50 21 22 62 127 100 143 ...
 $ P0010014 : int  0 0 2 0 0 2 2 0 0 2 ...
 $ P0010015 : int  183 339 189 165 212 86 187 158 216 322 ...
 $ P0010016 : int  15 30 23 4 13 0 1 0 5 6 ...
 $ P0010017 : int  13 19 9 1 5 0 7 5 7 5 ...
 $ P0010018 : int  4 0 0 0 3 0 0 2 2 0 ...
 $ P0010019 : int  38 68 73 37 38 0 0 10 1 14 ...
 $ P0010020 : int  0 0 2 0 0 0 1 0 0 0 ...
 $ P0010021 : int  0 0 0 0 0 0 0 0 0 0 ...
 $ P0010022 : int  0 23 27 1 12 0 1 4 3 4 ...
 $ P0010023 : int  1 2 1 2 0 0 3 0 0 0 ...
 $ P0010024 : int  4 1 2 2 0 0 1 6 0 0 ...
 $ P0010025 : int  2 0 2 0 0 0 0 1 0 0 ...
 $ P0010026 : int  32 41 41 33 14 11 16 25 9 32 ...
 $ P0010027 : int  19 10 6 6 2 2 3 0 1 3 ...
 $ P0010028 : int  1 4 6 0 5 1 0 6 2 1 ...
 $ P0010029 : int  0 0 0 0 0 1 0 0 0 0 ...
 $ P0010030 : int  2 4 12 6 1 0 5 9 4 10 ...
 $ P0010031 : int  0 1 0 0 1 0 0 0 0 1 ...
 $ P0010032 : int  0 0 0 0 0 0 0 0 0 0 ...
 $ P0010033 : int  7 17 14 10 2 4 2 4 0 10 ...
 $ P0010034 : int  0 0 1 1 0 1 2 1 0 0 ...
 $ P0010035 : int  1 4 2 3 0 1 2 2 1 4 ...
 $ P0010036 : int  0 0 0 0 0 0 1 2 0 0 ...
 $ P0010037 : int  2 0 0 0 0 0 1 0 0 1 ...
 $ P0010038 : int  0 0 0 0 0 0 0 0 0 1 ...
 $ P0010039 : int  0 0 0 3 3 0 0 1 0 1 ...
 $ P0010040 : int  0 0 0 0 0 0 0 0 0 0 ...
 $ P0010041 : int  0 0 0 0 0 1 0 0 1 0 ...
 $ P0010042 : int  0 1 0 1 0 0 0 0 0 0 ...
 $ P0010043 : int  0 0 0 0 0 0 0 0 0 0 ...
 $ P0010044 : int  0 0 0 0 0 0 0 0 0 0 ...
 $ P0010045 : int  0 0 0 0 0 0 0 0 0 0 ...
 $ P0010046 : int  0 0 0 3 0 0 0 0 0 0 ...
 $ P0010047 : int  1 3 8 3 12 1 0 3 4 4 ...
 $ P0010048 : int  0 1 6 0 1 0 0 1 1 0 ...
 $ P0010049 : int  0 0 0 0 0 0 0 0 0 0 ...
 $ P0010050 : int  0 1 2 3 11 1 0 2 3 2 ...
 $ P0010051 : int  0 0 0 0 0 0 0 0 0 0 ...
 $ P0010052 : int  0 0 0 0 0 0 0 0 0 1 ...
 $ P0010053 : int  0 0 0 0 0 0 0 0 0 0 ...
 $ P0010054 : int  0 0 0 0 0 0 0 0 0 0 ...
 $ P0010055 : int  0 0 0 0 0 0 0 0 0 0 ...
 $ P0010056 : int  0 0 0 0 0 0 0 0 0 0 ...
 $ P0010057 : int  0 1 0 0 0 0 0 0 0 0 ...
 $ P0010058 : int  0 0 0 0 0 0 0 0 0 1 ...
 $ P0010059 : int  1 0 0 0 0 0 0 0 0 0 ...
 $ P0010060 : int  0 0 0 0 0 0 0 0 0 0 ...
 $ P0010061 : int  0 0 0 0 0 0 0 0 0 0 ...
 $ P0010062 : int  0 0 0 0 0 0 0 0 0 0 ...
 $ P0010063 : int  0 0 0 1 0 0 0 0 0 1 ...
  [list output truncated]
 - attr(*, "sf_column")= chr "geometry"
 - attr(*, "agr")= Factor w/ 3 levels "constant","aggregate",..: NA NA NA NA NA NA NA NA NA NA ...
  ..- attr(*, "names")= chr [1:315] "OBJECTID" "TRACT" "GEOID" "P0010001" ...
Simple feature collection with 6 features and 315 fields
Geometry type: POLYGON
Dimension:     XY
Bounding box:  xmin: -8577962 ymin: 4708107 xmax: -8572564 ymax: 4716136
Projected CRS: WGS 84 / Pseudo-Mercator
  OBJECTID  TRACT       GEOID P0010001 P0010002 P0010003 P0010004 P0010005
1        1 002002 11001002002     4072     3647     1116     1751       27
2        2 002101 11001002101     5687     5071     1037     2642       86
3        3 002102 11001002102     5099     4599      901     2910       50
4        4 002201 11001002201     3485     3138      973     1722       21
5        5 002202 11001002202     3339     2957      619     1636       48
6        6 000101 11001000101     1406     1259     1083       47        5
  P0010006 P0010007 P0010008 P0020002 P0020005 P0020006 P0020007 P0020008
1       84        0      669     1022     1033     1722        4       83
2      120        1     1185     1868      885     2586       20      115
3       87        0      651     1074      805     2856       14       83
4       94        2      326      683      864     1687        3       85
5       57        3      594     1003      548     1594        8       54
6       92        1       31      140     1054       47        4       90
  P0020009 P0020010 P0030001 P0030003 P0030004 P0030005 P0030006 P0030007
1        0       21     3198      855     1505       24       71        0
2        1       39     4375      827     2149       58      100        1
3        0       40     3984      741     2340       44       79        0
4        2       14     2755      742     1444       13       75        2
5        0       22     2556      496     1310       38       57        3
6        1        3     1288     1005       39        2       90        1
  P0030008 P0040002 P0040005 P0040006 P0040007 P0040008 P0040009 P0040010
1      470      701      805     1490        4       70        0        8
2      816     1277      731     2110       16       95        1       25
3      433      735      674     2300       13       75        0       28
4      248      485      671     1427        1       73        2       11
5      393      672      452     1285        8       54        0       17
6       27      123      977       39        2       88        1        3
  H0010001 H0010002 H0010003  ALAND AWATER STUSAB SUMLEV     GEOCODE STATE
1     1532     1394      138 849376      0     DC    140 11001002002    11
2     2335     2107      228 600992      0     DC    140 11001002101    11
3     2221     1959      262 725975      0     DC    140 11001002102    11
4     1229     1141       88 415173      0     DC    140 11001002201    11
5     1454     1273      181 698895    566     DC    140 11001002202    11
6      999      865      134 199776   5261     DC    140 11001000101    11
                NAME POP100 HU100 P0010009 P0010010 P0010011 P0010012 P0010013
1 Census Tract 20.02   4072  1532      425      392       61       33       38
2 Census Tract 21.01   5687  2335      616      572       42       11       37
3 Census Tract 21.02   5099  2221      500      451       69       11       41
4 Census Tract 22.01   3485  1229      347      310       40        8       50
5 Census Tract 22.02   3339  1454      382      356       44        8       21
6  Census Tract 1.01   1406   999      147      135       17        8       22
  P0010014 P0010015 P0010016 P0010017 P0010018 P0010019 P0010020 P0010021
1        0      183       15       13        4       38        0        0
2        0      339       30       19        0       68        0        0
3        2      189       23        9        0       73        2        0
4        0      165        4        1        0       37        0        0
5        0      212       13        5        3       38        0        0
6        2       86        0        0        0        0        0        0
  P0010022 P0010023 P0010024 P0010025 P0010026 P0010027 P0010028 P0010029
1        0        1        4        2       32       19        1        0
2       23        2        1        0       41       10        4        0
3       27        1        2        2       41        6        6        0
4        1        2        2        0       33        6        0        0
5       12        0        0        0       14        2        5        0
6        0        0        0        0       11        2        1        1
  P0010030 P0010031 P0010032 P0010033 P0010034 P0010035 P0010036 P0010037
1        2        0        0        7        0        1        0        2
2        4        1        0       17        0        4        0        0
3       12        0        0       14        1        2        0        0
4        6        0        0       10        1        3        0        0
5        1        1        0        2        0        0        0        0
6        0        0        0        4        1        1        0        0
  P0010038 P0010039 P0010040 P0010041 P0010042 P0010043 P0010044 P0010045
1        0        0        0        0        0        0        0        0
2        0        0        0        0        1        0        0        0
3        0        0        0        0        0        0        0        0
4        0        3        0        0        1        0        0        0
5        0        3        0        0        0        0        0        0
6        0        0        0        1        0        0        0        0
  P0010046 P0010047 P0010048 P0010049 P0010050 P0010051 P0010052 P0010053
1        0        1        0        0        0        0        0        0
2        0        3        1        0        1        0        0        0
3        0        8        6        0        2        0        0        0
4        3        3        0        0        3        0        0        0
5        0       12        1        0       11        0        0        0
6        0        1        0        0        1        0        0        0
  P0010054 P0010055 P0010056 P0010057 P0010058 P0010059 P0010060 P0010061
1        0        0        0        0        0        1        0        0
2        0        0        0        1        0        0        0        0
3        0        0        0        0        0        0        0        0
4        0        0        0        0        0        0        0        0
5        0        0        0        0        0        0        0        0
6        0        0        0        0        0        0        0        0
  P0010062 P0010063 P0010064 P0010065 P0010066 P0010067 P0010068 P0010069
1        0        0        0        0        0        0        0        0
2        0        0        0        0        0        0        0        0
3        0        0        0        0        0        0        0        0
4        0        1        1        0        0        0        0        0
5        0        0        0        0        0        0        0        0
6        0        0        0        0        0        0        0        0
  P0010070 P0010071 P0020001 P0020003 P0020004 P0020011 P0020012 P0020013
1        0        0     4072     3050     2863      187      169       59
2        0        0     5687     3819     3646      173      160       39
3        0        0     5099     4025     3798      227      197       66
4        0        0     3485     2802     2655      147      131       40
5        0        0     3339     2336     2226      110      100       41
6        0        0     1406     1266     1199       67       61       16
  P0020014 P0020015 P0020016 P0020017 P0020018 P0020019 P0020020 P0020021
1        9       36        0       16       15       13        4       16
2        8       35        0       11       29       14        0       22
3        4       39        2       16       23        9        0       33
4        3       49        0       18        4        1        0       12
5        2       21        0        6       13        3        3        9
6        5       22        1       17        0        0        0        0
  P0020022 P0020023 P0020024 P0020025 P0020026 P0020027 P0020028 P0020029
1        0        0        0        1        0        0       18       14
2        0        0        0        2        0        0       13        6
3        2        0        0        1        2        0       23        6
4        0        0        0        2        2        0       15        6
5        0        0        2        0        0        0        8        1
6        0        0        0        0        0        0        5        2
  P0020030 P0020031 P0020032 P0020033 P0020034 P0020035 P0020036 P0020037
1        1        0        0        0        0        1        0        0
2        4        0        1        0        0        0        0        1
3        6        0       10        0        0        0        1        0
4        0        0        5        0        0        0        1        2
5        5        0        1        1        0        0        0        0
6        1        1        0        0        0        0        1        0
  P0020038 P0020039 P0020040 P0020041 P0020042 P0020043 P0020044 P0020045
1        0        2        0        0        0        0        0        0
2        0        0        0        0        0        0        1        0
3        0        0        0        0        0        0        0        0
4        0        0        0        0        0        0        1        0
5        0        0        0        0        0        0        0        0
6        0        0        0        0        0        0        0        0
  P0020046 P0020047 P0020048 P0020049 P0020050 P0020051 P0020052 P0020053
1        0        0        0        0        0        0        0        0
2        0        0        0        0        0        0        0        0
3        0        0        0        7        5        0        2        0
4        0        0        0        1        0        0        1        0
5        0        0        0        2        1        0        1        0
6        0        0        0        1        0        0        1        0
  P0020054 P0020055 P0020056 P0020057 P0020058 P0020059 P0020060 P0020061
1        0        0        0        0        0        0        0        0
2        0        0        0        0        0        0        0        0
3        0        0        0        0        0        0        0        0
4        0        0        0        0        0        0        0        0
5        0        0        0        0        0        0        0        0
6        0        0        0        0        0        0        0        0
  P0020062 P0020063 P0020064 P0020065 P0020066 P0020067 P0020068 P0020069
1        0        0        0        0        0        0        0        0
2        0        0        0        0        0        0        0        0
3        0        0        0        0        0        0        0        0
4        0        0        0        0        0        0        0        0
5        0        0        0        0        0        0        0        0
6        0        0        0        0        0        0        0        0
  P0020070 P0020071 P0020072 P0020073 P0030002 P0030009 P0030010 P0030011
1        0        0        0        0     2925      273      245       36
2        0        0        0        0     3951      424      392       23
3        0        0        0        0     3637      347      307       38
4        0        0        0        0     2524      231      201       17
5        0        0        0        0     2297      259      242       23
6        0        0        0        0     1164      124      116       17
  P0030012 P0030013 P0030014 P0030015 P0030016 P0030017 P0030018 P0030019
1       15       17        0      124        8        6        4       28
2        7       21        0      238       18       14        0       57
3       10       27        2      124       17        5        0       56
4        7       23        0      122        0        1        0       26
5        4       14        0      142        9        2        3       33
6        8       11        2       78        0        0        0        0
  P0030020 P0030021 P0030022 P0030023 P0030024 P0030025 P0030026 P0030027
1        0        0        0        1        4        2       27       16
2        0        0       11        2        1        0       30        8
3        2        0       22        0        2        2       32        6
4        0        0        1        2        2        0       26        6
5        0        0       12        0        0        0        7        1
6        0        0        0        0        0        0        8        2
  P0030028 P0030029 P0030030 P0030031 P0030032 P0030033 P0030034 P0030035
1        1        0        0        0        0        7        0        1
2        4        0        1        0        0       16        0        1
3        5        0       11        0        0        7        1        2
4        0        0        5        0        0       10        1        1
5        0        0        0        1        0        2        0        0
6        1        1        0        0        0        3        1        0
  P0030036 P0030037 P0030038 P0030039 P0030040 P0030041 P0030042 P0030043
1        0        2        0        0        0        0        0        0
2        0        0        0        0        0        0        0        0
3        0        0        0        0        0        0        0        0
4        0        0        0        0        0        0        0        0
5        0        0        0        3        0        0        0        0
6        0        0        0        0        0        0        0        0
  P0030044 P0030045 P0030046 P0030047 P0030048 P0030049 P0030050 P0030051
1        0        0        0        1        0        0        0        0
2        0        0        0        2        0        0        1        0
3        0        0        0        8        6        0        2        0
4        0        0        3        3        0        0        3        0
5        0        0        0       10        0        0       10        0
6        0        0        0        0        0        0        0        0
  P0030052 P0030053 P0030054 P0030055 P0030056 P0030057 P0030058 P0030059
1        0        0        0        0        0        0        0        1
2        0        0        0        0        0        1        0        0
3        0        0        0        0        0        0        0        0
4        0        0        0        0        0        0        0        0
5        0        0        0        0        0        0        0        0
6        0        0        0        0        0        0        0        0
  P0030060 P0030061 P0030062 P0030063 P0030064 P0030065 P0030066 P0030067
1        0        0        0        0        0        0        0        0
2        0        0        0        0        0        0        0        0
3        0        0        0        0        0        0        0        0
4        0        0        0        1        1        0        0        0
5        0        0        0        0        0        0        0        0
6        0        0        0        0        0        0        0        0
  P0030068 P0030069 P0030070 P0030071 P0040001 P0040003 P0040004 P0040011
1        0        0        0        0     3198     2497     2377      120
2        0        0        0        0     4375     3098     2978      120
3        0        0        0        0     3984     3249     3090      159
4        0        0        0        0     2755     2270     2185       85
5        0        0        0        0     2556     1884     1816       68
6        0        0        0        0     1288     1165     1110       55
  P0040012 P0040013 P0040014 P0040015 P0040016 P0040017 P0040018 P0040019
1      105       36        5       17        0       16        8        6
2      109       23        5       20        0       11       17       14
3      130       38        4       27        2        9       17        5
4       71       17        3       22        0       15        0        1
5       66       23        2       14        0        6        9        0
6       50       16        5       11        1       17        0        0
  P0040020 P0040021 P0040022 P0040023 P0040024 P0040025 P0040026 P0040027
1        4       12        0        0        0        1        0        0
2        0       17        0        0        0        2        0        0
3        0       24        2        0        0        0        2        0
4        0        9        0        0        0        2        2        0
5        3        7        0        0        2        0        0        0
6        0        0        0        0        0        0        0        0
  P0040028 P0040029 P0040030 P0040031 P0040032 P0040033 P0040034 P0040035
1       15       11        1        0        0        0        0        1
2       11        5        4        0        1        0        0        0
3       22        6        5        0       10        0        0        0
4       13        6        0        0        5        0        0        0
5        1        0        0        0        0        1        0        0
6        5        2        1        1        0        0        0        0
  P0040036 P0040037 P0040038 P0040039 P0040040 P0040041 P0040042 P0040043
1        0        0        0        2        0        0        0        0
2        0        1        0        0        0        0        0        0
3        1        0        0        0        0        0        0        0
4        1        1        0        0        0        0        0        0
5        0        0        0        0        0        0        0        0
6        1        0        0        0        0        0        0        0
  P0040044 P0040045 P0040046 P0040047 P0040048 P0040049 P0040050 P0040051
1        0        0        0        0        0        0        0        0
2        0        0        0        0        0        0        0        0
3        0        0        0        0        0        7        5        0
4        0        0        0        0        0        1        0        0
5        0        0        0        0        0        1        0        0
6        0        0        0        0        0        0        0        0
  P0040052 P0040053 P0040054 P0040055 P0040056 P0040057 P0040058 P0040059
1        0        0        0        0        0        0        0        0
2        0        0        0        0        0        0        0        0
3        2        0        0        0        0        0        0        0
4        1        0        0        0        0        0        0        0
5        1        0        0        0        0        0        0        0
6        0        0        0        0        0        0        0        0
  P0040060 P0040061 P0040062 P0040063 P0040064 P0040065 P0040066 P0040067
1        0        0        0        0        0        0        0        0
2        0        0        0        0        0        0        0        0
3        0        0        0        0        0        0        0        0
4        0        0        0        0        0        0        0        0
5        0        0        0        0        0        0        0        0
6        0        0        0        0        0        0        0        0
  P0040068 P0040069 P0040070 P0040071 P0040072 P0040073 P0050001 P0050002
1        0        0        0        0        0        0       32        0
2        0        0        0        0        0        0       31        0
3        0        0        0        0        0        0      145        0
4        0        0        0        0        0        0        6        6
5        0        0        0        0        0        0        4        0
6        0        0        0        0        0        0        0        0
  P0050003 P0050004 P0050005 P0050006 P0050007 P0050008 P0050009 P0050010
1        0        0        0        0       32        0        0       32
2        0        0        0        0       31        0        0       31
3        0        0        0        0      145        0        0      145
4        0        6        0        0        0        0        0        0
5        0        0        0        0        4        0        0        4
6        0        0        0        0        0        0        0        0
  SHAPEAREA SHAPELEN                       geometry
1         0        0 POLYGON ((-8575655 4714476,...
2         0        0 POLYGON ((-8574745 4715676,...
3         0        0 POLYGON ((-8573824 4715684,...
4         0        0 POLYGON ((-8574654 4714781,...
5         0        0 POLYGON ((-8573792 4714811,...
6         0        0 POLYGON ((-8577962 4708867,...
[1] 206 316
Simple feature collection with 1 feature and 315 fields
Geometry type: POLYGON
Dimension:     XY
Bounding box:  xmin: -8575656 ymin: 4713958 xmax: -8574562 ymax: 4716136
Projected CRS: WGS 84 / Pseudo-Mercator
  OBJECTID  TRACT       GEOID P0010001 P0010002 P0010003 P0010004 P0010005
1        1 002002 11001002002     4072     3647     1116     1751       27
  P0010006 P0010007 P0010008 P0020002 P0020005 P0020006 P0020007 P0020008
1       84        0      669     1022     1033     1722        4       83
  P0020009 P0020010 P0030001 P0030003 P0030004 P0030005 P0030006 P0030007
1        0       21     3198      855     1505       24       71        0
  P0030008 P0040002 P0040005 P0040006 P0040007 P0040008 P0040009 P0040010
1      470      701      805     1490        4       70        0        8
  H0010001 H0010002 H0010003  ALAND AWATER STUSAB SUMLEV     GEOCODE STATE
1     1532     1394      138 849376      0     DC    140 11001002002    11
                NAME POP100 HU100 P0010009 P0010010 P0010011 P0010012 P0010013
1 Census Tract 20.02   4072  1532      425      392       61       33       38
  P0010014 P0010015 P0010016 P0010017 P0010018 P0010019 P0010020 P0010021
1        0      183       15       13        4       38        0        0
  P0010022 P0010023 P0010024 P0010025 P0010026 P0010027 P0010028 P0010029
1        0        1        4        2       32       19        1        0
  P0010030 P0010031 P0010032 P0010033 P0010034 P0010035 P0010036 P0010037
1        2        0        0        7        0        1        0        2
  P0010038 P0010039 P0010040 P0010041 P0010042 P0010043 P0010044 P0010045
1        0        0        0        0        0        0        0        0
  P0010046 P0010047 P0010048 P0010049 P0010050 P0010051 P0010052 P0010053
1        0        1        0        0        0        0        0        0
  P0010054 P0010055 P0010056 P0010057 P0010058 P0010059 P0010060 P0010061
1        0        0        0        0        0        1        0        0
  P0010062 P0010063 P0010064 P0010065 P0010066 P0010067 P0010068 P0010069
1        0        0        0        0        0        0        0        0
  P0010070 P0010071 P0020001 P0020003 P0020004 P0020011 P0020012 P0020013
1        0        0     4072     3050     2863      187      169       59
  P0020014 P0020015 P0020016 P0020017 P0020018 P0020019 P0020020 P0020021
1        9       36        0       16       15       13        4       16
  P0020022 P0020023 P0020024 P0020025 P0020026 P0020027 P0020028 P0020029
1        0        0        0        1        0        0       18       14
  P0020030 P0020031 P0020032 P0020033 P0020034 P0020035 P0020036 P0020037
1        1        0        0        0        0        1        0        0
  P0020038 P0020039 P0020040 P0020041 P0020042 P0020043 P0020044 P0020045
1        0        2        0        0        0        0        0        0
  P0020046 P0020047 P0020048 P0020049 P0020050 P0020051 P0020052 P0020053
1        0        0        0        0        0        0        0        0
  P0020054 P0020055 P0020056 P0020057 P0020058 P0020059 P0020060 P0020061
1        0        0        0        0        0        0        0        0
  P0020062 P0020063 P0020064 P0020065 P0020066 P0020067 P0020068 P0020069
1        0        0        0        0        0        0        0        0
  P0020070 P0020071 P0020072 P0020073 P0030002 P0030009 P0030010 P0030011
1        0        0        0        0     2925      273      245       36
  P0030012 P0030013 P0030014 P0030015 P0030016 P0030017 P0030018 P0030019
1       15       17        0      124        8        6        4       28
  P0030020 P0030021 P0030022 P0030023 P0030024 P0030025 P0030026 P0030027
1        0        0        0        1        4        2       27       16
  P0030028 P0030029 P0030030 P0030031 P0030032 P0030033 P0030034 P0030035
1        1        0        0        0        0        7        0        1
  P0030036 P0030037 P0030038 P0030039 P0030040 P0030041 P0030042 P0030043
1        0        2        0        0        0        0        0        0
  P0030044 P0030045 P0030046 P0030047 P0030048 P0030049 P0030050 P0030051
1        0        0        0        1        0        0        0        0
  P0030052 P0030053 P0030054 P0030055 P0030056 P0030057 P0030058 P0030059
1        0        0        0        0        0        0        0        1
  P0030060 P0030061 P0030062 P0030063 P0030064 P0030065 P0030066 P0030067
1        0        0        0        0        0        0        0        0
  P0030068 P0030069 P0030070 P0030071 P0040001 P0040003 P0040004 P0040011
1        0        0        0        0     3198     2497     2377      120
  P0040012 P0040013 P0040014 P0040015 P0040016 P0040017 P0040018 P0040019
1      105       36        5       17        0       16        8        6
  P0040020 P0040021 P0040022 P0040023 P0040024 P0040025 P0040026 P0040027
1        4       12        0        0        0        1        0        0
  P0040028 P0040029 P0040030 P0040031 P0040032 P0040033 P0040034 P0040035
1       15       11        1        0        0        0        0        1
  P0040036 P0040037 P0040038 P0040039 P0040040 P0040041 P0040042 P0040043
1        0        0        0        2        0        0        0        0
  P0040044 P0040045 P0040046 P0040047 P0040048 P0040049 P0040050 P0040051
1        0        0        0        0        0        0        0        0
  P0040052 P0040053 P0040054 P0040055 P0040056 P0040057 P0040058 P0040059
1        0        0        0        0        0        0        0        0
  P0040060 P0040061 P0040062 P0040063 P0040064 P0040065 P0040066 P0040067
1        0        0        0        0        0        0        0        0
  P0040068 P0040069 P0040070 P0040071 P0040072 P0040073 P0050001 P0050002
1        0        0        0        0        0        0       32        0
  P0050003 P0050004 P0050005 P0050006 P0050007 P0050008 P0050009 P0050010
1        0        0        0        0       32        0        0       32
  SHAPEAREA SHAPELEN                       geometry
1         0        0 POLYGON ((-8575655 4714476,...

Raster Data

So far, we have considered point, line, and polygon data, all of which fall under the umbrella of vector data. Rasters are a distinct GIS data type, which we will consider only briefly because they cannot be handled with sf methods. We will look at the volcano dataset, which gives topographic information for Maunga Whau (a volcano in Auckland, New Zealand) on a 10 m by 10 m grid. Because it is a relatively small raster, we can handle volcano with base R functions. Larger rasters should be handled with a dedicated package such as raster (or its successor, terra).

[1] "matrix" "array" 
 num [1:87, 1:61] 100 101 102 103 104 105 105 106 107 108 ...
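
To make the grid idea concrete, here is a minimal sketch in plain Python (the tutorial itself works in R) that treats a raster as a matrix of values plus a cell size. The 87 by 61 dimensions and the 10 m spacing come from the output and description above. The helper name `raster_extent` is hypothetical, and treating each matrix entry as a full 10 m by 10 m cell is an assumption made for illustration.

```python
def raster_extent(n_rows, n_cols, cell_size):
    """Distance covered along each matrix dimension, assuming every
    matrix entry is one square cell of side `cell_size`."""
    return n_rows * cell_size, n_cols * cell_size


# volcano: an 87 x 61 numeric matrix on a 10 m by 10 m grid
along_rows, along_cols = raster_extent(87, 61, 10)
n_cells = 87 * 61

print(along_rows, along_cols)  # metres covered along each dimension
print(n_cells)                 # number of elevation values stored
```

Unlike the vector data above, no coordinates are stored per value: the position of each elevation reading is implied by its row and column index together with the cell size.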

NYS Lyme incidence data

We will use New York State county-aggregated Lyme disease incidence for 2014-2016 to try our hand at spatial analysis. This data is publicly available on the Health Data NY website, where the raw data can be downloaded in .csv format. To see how this tabular data is merged with a New York State county shapefile (available at NYS GIS), look at the ‘prep_nys_lyme_data.R’ script in the ‘data’ folder of our project directory. For now, we’ll start with data that has already been merged and converted to an ‘sf’ object.

[1] "sf"         "data.frame"

This data is an example of regional rate data. An easy way to map it is with the ‘tmap’ package. Let’s load ‘tmap’ and create an interactive map with just a few lines of code.

We have a missing value in St. Lawrence County. Let’s remove this row from the data so that it doesn’t cause an error later in our analysis.
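
The row-dropping step can be sketched in plain Python as follows. This is only an illustration of the idea, not the tutorial's own R code: the records, county labels, and rates below are made-up placeholders, and the field names `county` and `rate` are assumptions. In R, the same idea is commonly expressed by filtering rows on `!is.na(...)`.

```python
# Hypothetical records: labels and rates are placeholders, not the real data
records = [
    {"county": "County A", "rate": 12.5},
    {"county": "County B", "rate": None},  # missing incidence value
    {"county": "County C", "rate": 40.0},
]

# Keep only the rows whose rate is present
complete = [row for row in records if row["rate"] is not None]

print([row["county"] for row in complete])
```

Dropping the row before the analysis matters because statistics that sum over all regions, such as a mean, cannot be computed when one of the values is missing.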

Global clustering (Moran’s I)

This section was adapted from a tutorial created by Manuel Gimond, which can be found on his GitHub page.
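
For orientation, global Moran’s I is commonly defined as \(I = \frac{n}{S_0} \cdot \frac{\sum_i \sum_j w_{ij}(x_i - \bar{x})(x_j - \bar{x})}{\sum_i (x_i - \bar{x})^2}\), where \(w_{ij}\) is the spatial weight between regions \(i\) and \(j\) and \(S_0\) is the sum of all weights. The sketch below implements that textbook formula in plain Python. The four-region "chain" layout, its binary adjacency weights, and the input values are all hypothetical, and the function `morans_i` is an illustrative helper, not the routine used in the tutorial.

```python
def morans_i(values, weights):
    """Global Moran's I for a list of values and a square weights matrix."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]          # deviations from the mean
    s0 = sum(sum(row) for row in weights)     # sum of all spatial weights
    cross = sum(
        weights[i][j] * dev[i] * dev[j]
        for i in range(n)
        for j in range(n)
    )
    return (n / s0) * cross / sum(d * d for d in dev)


# Hypothetical layout: four regions in a row, each adjacent to its neighbors
chain = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]

smooth = morans_i([1, 2, 3, 4], chain)       # neighbors have similar values
alternating = morans_i([1, 4, 1, 4], chain)  # neighbors have opposite values
print(smooth, alternating)
```

Positive values indicate that similar values sit next to each other (clustering), while negative values indicate that neighbors tend to differ. Because the deviations \(x_i - \bar{x}\) enter as products, a single extreme value can dominate both sums, which is why the distribution is checked next.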

Assess data distribution

Let’s begin by looking at the distribution of our Lyme incidence rate data. Moran’s I is not robust to extreme values or outliers, so we will need to transform our data if its distribution deviates greatly from normal.

   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
   1.90   13.50   36.30   80.27   98.50  599.60 

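Our data is strongly right-skewed, with lots of outliers much greater than the mean: the mean (80.27) is more than twice the median (36.30), and the maximum (599.60) lies far above the third quartile (98.50). Let’s see if a log transformation can make our data look more normal.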
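
A quick way to see why a log transformation helps is to apply it to the summary statistics printed above. The sketch below is plain Python rather than the tutorial's R code, and it assumes the natural logarithm. It compares how far the minimum and the maximum sit from the median before and after the transformation.

```python
import math

# Values taken from the summary output above
minimum, median, maximum = 1.90, 36.30, 599.60

# Raw scale: distance from the median down to the min and up to the max
raw_lower = median - minimum
raw_upper = maximum - median

# Log scale (natural log assumed)
log_lower = math.log(median) - math.log(minimum)
log_upper = math.log(maximum) - math.log(median)

print(round(raw_lower, 1), round(raw_upper, 1))
print(round(log_lower, 2), round(log_upper, 2))
```

On the raw scale the upper tail is more than ten times longer than the lower one, whereas on the log scale the two distances are nearly equal, which is what a roughly symmetric distribution looks like.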

That’s much better! We can see what our log-transformed values look like on a map.
